-
Notifications
You must be signed in to change notification settings - Fork 811
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Merged by Bors] - Deduplicate block root computation #3590
Conversation
This is also part of the consensus context on |
Sounds reasonable. This does a fair amount of work in the networking stack to remove duplicate hashes there too, does the consensus context extend down into the networking side of things? |
I can figure that out myself! I'm going to start playing with tree-states now :) |
As you may have already discovered, it does not do any magic in networking land 😅 Maybe we can merge this PR almost as-is and I can open a follow-up PR with the consensus context. I've been thinking it would be cool to have a bunch of per-slot and per-epoch caches that get attached to the context to accelerate various things. And we could build these caches off the hot path of block verification by doing it for the next slot after the processing of a new block (maybe in state advance) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've checked this thoroughly and I'm satisfied that it's correct. Excited to improve our block processing times further 🚀
This optimisation has the nice property that it's quite hard to screw up, because the block makes no claim as to its own block root, so if we have a block root available then it must have been computed from the block. The only risk would be using tree_hash_root()
on the signed beacon block rather than canonical_root()
, and I've checked that this PR also doesn't do that.
Let's merge!
bors r+ |
## Issue Addressed NA ## Proposed Changes This PR removes duplicated block root computation. Computing the `SignedBeaconBlock::canonical_root` has become more expensive since the merge as we need to compute the merke root of each transaction inside an `ExecutionPayload`. Computing the root for [a mainnet block](https://beaconcha.in/slot/4704236) is taking ~10ms on my i7-8700K CPU @ 3.70GHz (no sha extensions). Given that our median seen-to-imported time for blocks is presently 300-400ms, removing a few duplicated block roots (~30ms) could represent an easy 10% improvement. When we consider that the seen-to-imported times include operations *after* the block has been placed in the early attester cache, we could expect the 30ms to be more significant WRT our seen-to-attestable times. ## Additional Info NA
## Issue Addressed NA ## Proposed Changes This PR removes duplicated block root computation. Computing the `SignedBeaconBlock::canonical_root` has become more expensive since the merge as we need to compute the merke root of each transaction inside an `ExecutionPayload`. Computing the root for [a mainnet block](https://beaconcha.in/slot/4704236) is taking ~10ms on my i7-8700K CPU @ 3.70GHz (no sha extensions). Given that our median seen-to-imported time for blocks is presently 300-400ms, removing a few duplicated block roots (~30ms) could represent an easy 10% improvement. When we consider that the seen-to-imported times include operations *after* the block has been placed in the early attester cache, we could expect the 30ms to be more significant WRT our seen-to-attestable times. ## Additional Info NA
Issue Addressed
NA
Proposed Changes
This PR removes duplicated block root computation.
Computing the
SignedBeaconBlock::canonical_root
has become more expensive since the merge as we need to compute the merke root of each transaction inside anExecutionPayload
.Computing the root for a mainnet block is taking ~10ms on my i7-8700K CPU @ 3.70GHz (no sha extensions). Given that our median seen-to-imported time for blocks is presently 300-400ms, removing a few duplicated block roots (~30ms) could represent an easy 10% improvement. When we consider that the seen-to-imported times include operations after the block has been placed in the early attester cache, we could expect the 30ms to be more significant WRT our seen-to-attestable times.
Additional Info
NA